Asunción
Identifiable Deep Latent Variable Models for MNAR Data
Xie, Huiming, Xue, Fei, Wang, Xiao
Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is independent of the missing values themselves. This assumption is frequently violated in real-world scenarios, which has prompted recent advances in deep-learning-based imputation methods that address this challenge. However, these methods neglect the crucial issue of nonparametric identifiability in missing-not-at-random (MNAR) data, which can lead to biased and unreliable results. This paper seeks to bridge this gap by proposing a novel framework based on deep latent variable models for MNAR data. Building on the assumption of conditional no self-censoring given latent variables, we establish the identifiability of the data distribution. This crucial theoretical result guarantees the feasibility of our approach. To effectively estimate unknown parameters, we develop an efficient algorithm utilizing importance-weighted autoencoders. We demonstrate, both theoretically and empirically, that our estimation process accurately recovers the ground-truth joint distribution under specific regularity conditions. Extensive simulation studies and real-world data experiments showcase the advantages of our proposed method compared to various classical and state-of-the-art approaches to missing data imputation.
- Africa > Botswana (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
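To make the distinction drawn in the abstract above concrete, the following is a minimal simulation sketch, not the paper's method. It assumes a standard normal variable, uses missing-completely-at-random as a simple special case of MAR, and uses a logistic self-censoring rule as one illustrative MNAR mechanism. The function name `simulate_observed_means` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_observed_means(n=100_000, seed=0):
    """Compare the mean of observed values under two missingness mechanisms.

    Illustrative assumptions (not taken from the paper):
    - x is standard normal
    - MCAR: each value is dropped with probability 0.5, independent of x
    - MNAR: the drop probability is sigmoid(2 * x), so it depends on the
      value that goes missing (self-censoring)
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)

    # Missingness that ignores the data entirely.
    mcar_missing = rng.random(n) < 0.5

    # Missingness driven by the value itself: large x is more likely dropped.
    p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
    mnar_missing = rng.random(n) < p_missing

    true_mean = x.mean()
    mcar_mean = x[~mcar_missing].mean()
    mnar_mean = x[~mnar_missing].mean()
    return true_mean, mcar_mean, mnar_mean

true_mean, mcar_mean, mnar_mean = simulate_observed_means()
print(f"true mean:          {true_mean:.3f}")
print(f"observed mean MCAR: {mcar_mean:.3f}")
print(f"observed mean MNAR: {mnar_mean:.3f}")
```

Under these assumptions the MCAR estimate should stay close to the true mean, while the MNAR estimate is pulled downward because large values are preferentially removed. This is the kind of bias that motivates modeling the missingness mechanism explicitly.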
On the Number of Conditional Independence Tests in Constraint-based Causal Discovery
Monés, Marc Franquesa, Zhang, Jiaqi, Uhler, Caroline
Learning causal relations from observational data is a fundamental problem with wide-ranging applications across many fields. Constraint-based methods infer the underlying causal structure by performing conditional independence tests. However, existing algorithms such as the prominent PC algorithm need to perform a large number of independence tests, which in the worst case is exponential in the maximum degree of the causal graph. Despite extensive research, it remains unclear if there exist algorithms with better complexity without additional assumptions. Here, we establish an algorithm that achieves a better complexity of $p^{\mathcal{O}(s)}$ tests, where $p$ is the number of nodes in the graph and $s$ denotes the maximum undirected clique size of the underlying essential graph. Complementing this result, we prove that any constraint-based algorithm must perform at least $2^{\Omega(s)}$ conditional independence tests, establishing that our proposed algorithm achieves exponent-optimality up to a logarithmic factor in terms of the number of conditional independence tests needed. Finally, we validate our theoretical findings through simulations, on semi-synthetic gene-expression data, and on real-world data, demonstrating the efficiency of our algorithm compared to existing methods in terms of the number of conditional independence tests needed.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
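The conditional independence tests that the abstract above counts can take many forms. As a hedged illustration only, the sketch below implements one common choice for jointly Gaussian data, the Fisher z-test on partial correlations. The abstract does not say which test the authors use, and the helper names `partial_corr` and `fisher_z_pvalue` are hypothetical.

```python
import math
import numpy as np

def partial_corr(data, i, j, cond=()):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)  # precision matrix of the selected columns
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])

def fisher_z_pvalue(data, i, j, cond=()):
    """Two-sided p-value for H0: column i is independent of column j given cond.

    Assumes the data are jointly Gaussian (an assumption of this sketch).
    """
    n = data.shape[0]
    r = partial_corr(data, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2.0))

# Toy chain X -> Y -> Z: X and Z are dependent, but independent given Y.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z = y + rng.normal(size=n)
data = np.column_stack([x, y, z])

print("corr(X, Z):       ", partial_corr(data, 0, 2))
print("corr(X, Z | Y):   ", partial_corr(data, 0, 2, cond=(1,)))
print("p-value X _||_ Z: ", fisher_z_pvalue(data, 0, 2))
```

A constraint-based algorithm calls a test like this once per candidate pair and conditioning set, which is why the number of calls is the natural complexity measure.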
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Data Science > Data Mining (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
- Asia > Middle East > Jordan (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- North America > United States (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Banking & Finance (1.00)
- Health & Medicine > Therapeutic Area (0.93)
- Law (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
- South America > Paraguay > Asunción > Asunción (0.04)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
- Oceania > New Zealand (0.04)
- Workflow (0.93)
- Research Report > Experimental Study (0.93)
- Energy > Power Industry (0.68)
- Energy > Renewable > Solar (0.47)
- Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Education > Curriculum > Subject-Specific Education (0.96)
- Health & Medicine (0.69)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- South America > Paraguay > Asunción > Asunción (0.04)
- Europe > France (0.04)
- Asia > Taiwan (0.04)
- Asia > Middle East > Jordan (0.04)
- Oceania > Australia (0.04)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.45)
- Information Technology (1.00)
- Government (1.00)
- Banking & Finance > Trading (1.00)